Big Community Data before World Wide Web Era
Abstract
This paper introduces the NIFTY-Serve corpus, a large data archive collected from Japanese discussion forums that operated via a Bulletin Board System (BBS) between 1987 and 2006. The corpus can be used in Artificial Intelligence research such as Natural Language Processing and Community Analysis. The NIFTY-Serve corpus differs from data on the WWW in three ways: (1) it is essentially spam- and duplication-free because of strict data collection procedures, (2) it contains historic user-generated data from before the WWW, and (3) it is a complete data set, because the service has now shut down. We also present some example uses of the corpus. We plan to release this corpus to research institutes for research purposes. To use this corpus, please email [email protected].
Similar resources
Exploration and Visualization in the Web of Big Linked Data: A Survey of the State of the Art
Data exploration and visualization systems are of great importance in the Big Data era. Exploring and visualizing very large datasets has become a major research challenge, of which scalability is a vital requirement. In this survey, we describe the major prerequisites and challenges that should be addressed by the modern exploration and visualization systems. Considering these challenges, we p...
We Are Boring
Despite all of these applications revolving around data, the database community appears to be content to cede these domains to our AI colleagues. This is absurdly short-sighted, and just as with the world-wide web, and (nearly) big data, we risk being an also-ran in the most significant trend in computer science in the coming decade. These smart systems will change the way we commute, work, and...
Semantic Web technologies for the big data in life sciences.
The life sciences field is entering an era of big data with the breakthroughs of science and technology. More and more big data-related projects and activities are being performed in the world. Life sciences data generated by new technologies are continuing to grow in not only size but also variety and complexity, with great speed. To ensure that big data has a major influence in the life scien...
Social Network Analysis with Content and Graphs
As a consequence of changing economic and social realities, the increased availability of large-scale, real-world sociographic data has ushered in a new era of research and development in social network analysis. The quantity of content-based data created every day by traditional and social media, sensors, and mobile devices provides great opportunities and unique challenges for the automatic a...
What a difference. A decade makes.
This was an era when the Internet really was the ‘Information Superhighway’. The early years of the digital age saw many organisations starting to move their services from using traditional to electronic forms of propagation. Part of this was to have a presence on the World Wide Web, initially resembling an online version of their sales brochure but later becoming more of a depository for speci...
Publication date: 2016